Autonomous Instruction Memory Equipped with Dynamic Branch Handling Capability
Abstract
In portable information appliances, memory accesses account for an extraordinarily large share of power consumption, which makes efficient memory system design critically important. We address two issues for instruction accesses: how to minimize the memory bandwidth requirement and how to minimize memory access delay. We propose moving the dynamic branch handler (e.g., the branch target buffer, BTB) from the CPU to the instruction memory side, i.e., binding the BTB and the instruction memory in the same bus module. We present such a design, which helps achieve the following goals: (1) greatly reducing instruction address bus traffic, which saves considerable bus power; and (2) autonomously piping instructions to the CPU without waiting for addresses from the CPU, which saves considerable delay. Using MediaBench programs as test programs, our design reduces instruction address bus traffic by 83%, even when accounting for repeated and redundant target address traffic in the instruction decode stage. Furthermore, we suggest very simple techniques to eliminate the transmission of most of these target addresses.
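The idea of a memory-side BTB can be illustrated with a toy model. The sketch below is not the authors' design: it assumes a fixed 4-byte instruction size, an unbounded BTB that remembers only the last taken target of each branch, and a hypothetical function name `count_address_transfers`. It counts how often the CPU would have to drive an address onto the bus in a conventional system (once per fetch) versus a system where the memory predicts the next address itself and the CPU sends an address only to correct a wrong prediction. The numbers it produces are illustrative and are not meant to reproduce the 83% figure.

```python
from typing import Dict, List, Tuple


def count_address_transfers(trace: List[int], step: int = 4) -> Tuple[int, int]:
    """Return (baseline, memory_side) address-bus transfer counts.

    trace: sequence of fetched instruction addresses (program counters).
    step:  assumed fixed instruction size in bytes (an assumption of this sketch).
    """
    # Conventional system: the CPU drives one address per instruction fetch.
    baseline = len(trace)

    # Memory-side BTB (hypothetical, unbounded): branch PC -> last taken target.
    btb: Dict[int, int] = {}

    # The CPU must send the starting address once.
    memory_side = 1 if trace else 0

    for pc, actual_next in zip(trace, trace[1:]):
        # The memory predicts the next fetch address on its own.
        predicted_next = btb.get(pc, pc + step)
        if predicted_next != actual_next:
            # Wrong prediction: the CPU sends the correct address over the bus.
            memory_side += 1
        if actual_next != pc + step:
            btb[pc] = actual_next   # taken branch: remember its target
        else:
            btb.pop(pc, None)       # fell through: forget any stale target

    return baseline, memory_side


# A four-instruction loop executed ten times, then a fall-through to address 16.
loop_trace = [0, 4, 8, 12] * 10 + [16]
baseline, memory_side = count_address_transfers(loop_trace)
print(baseline, memory_side)
```

In this toy trace the CPU sends an address only at start-up, on the first taken loop branch, and on the loop exit; every other fetch is supplied by the memory without waiting for the CPU.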
Similar resources
A Decoupled Fetch-Execute Engine with Static Branch Prediction Support
We describe a method for supporting static branch prediction on a decoupled fetch-execute pipeline. Using instruction buffers to decouple instruction fetch from the execute pipeline is an effective way to minimize instruction cache penalties by allowing instruction fetch and stall miss handling to proceed independent of the execution pipeline. Dynamic branch prediction is typically used with su...
The Effective Way of Processor Performance Enhancement by Proper Branch Handling
Processor performance is highly dependent on the regular supply of correct instructions at the right time. To reduce instruction cache misses, one proposed mechanism is instruction prefetching, which in turn increases the supply of instructions to the processor. Technology developments in these fields indicate that in the future the gap between the processing speeds of the processor and dat...
VLIW Processors: Efficiently Exploiting Instruction Level Parallelism (a dissertation submitted to the Department of Electrical Engineering and the Committee on Graduate Studies of Stanford University in partial fulfillment of the requirements for the degree of Doctor of Philosophy)
This dissertation explores high-performance complexity-efficient processors focusing on VLIW processors. Complexity efficiency is a qualitative characteristic that describes a system where performance has not reached the point of diminishing returns. Using the techniques described in this dissertation, simple statically-scheduled very-long-instruction-word (VLIW) processors can be efficient arch...
Improving the Dynamic Performance of Passenger Cars via Rear Suspension Mechanism Modification
This paper presents the results of a recent project of IKCo’s research center to modify Paykan 1600’s rear suspension mechanism with the purpose of improving comfort, stability and handling qualities. The car was originally equipped with a solid rear axle with leaf springs. By replacing the original mechanism with a three-link mechanism with panhard bar and coil springs, the ride comfort and ha...
Understanding the Backward Slices of Performance Degrading Instructions Craig
For many applications, branch mispredictions and cache misses limit a processor's performance to a level well below its peak instruction throughput. A small fraction of static instructions, whose behavior cannot be anticipated using current branch predictors and caches, contribute a large fraction of such performance degrading events. This paper analyzes the dynamic instruction stream leading u...